THE GAUSSIAN LAW OF ERROR FOR ANY 
NUMBER OF VARIABLES* 

BY 

J. L. COOLIDGE 

The exponential law for the distribution of accidental errors of observation, 
discovered by Gauss, has been a mathematical classic for over a century. 
Many have been the attempts to prove it, all based, necessarily, on more or 
less arbitrary assumptions. Perhaps the most searching examination of it was 
given by Poincaré in his Calcul des probabilités; his final opinion seems to be 
contained in the following phrase:† 

"J'ai plaidé de mon mieux jusqu'ici en faveur de la loi de Gauss dont nous 
allons maintenant tirer les conséquences. Peut-être pourtant la cause n'était-elle 
pas parfaitement bonne. 

"Elle ne s'obtient pas par des déductions rigoureuses, plus d'une démonstration 
qu'on a voulu en donner est grossière, entre autres celle qui s'appuie 
sur l'affirmation que la probabilité des écarts est proportionnelle aux écarts. 
Tout le monde y croit cependant, me disait un jour M. Lippmann, car les 
expérimentateurs s'imaginent que c'est un théorème de mathématiques, et les 
mathématiciens, que c'est un fait expérimental." 

The law has been extended to include the distribution of errors depending 
upon two variables, and in this form it has a certain importance in the theory 
of ballistics, and in that of statistical correlation; even the case of three vari- 
ables has been slightly treated. The general case of n variables has never 
been taken up except in two recent articles by von Mises.‡ His treatment 
is based on a very general form of analysis showing how an arbitrary 
distribution function will lead asymptotically to an exponential form. The 
analysis is very careful, the point of view extremely abstract, with little 
relation to practical applications. Moreover, the author gives no indication 
how the constants should be calculated in any particular case. It is the 
object of the present paper to deduce the Gaussian law for n variables by a 
method based upon the classical one for a single variable, but with somewhat 
broader and more explicit assumptions. In the second part we shall make 
the additional assumptions necessary to determine the coefficients in any 
particular case, and show how these latter may then be calculated in a simple 
manner. 

* Presented to the Society, December 27, 1922. 

† Poincaré, Calcul des Probabilités, Paris, 1896, pp. 196 and 149. 

‡ Fundamentalsätze der Wahrscheinlichkeitsrechnung, Mathematische Zeitschrift, 
vol. 4 (1919), and Grundlagen der Wahrscheinlichkeitsrechnung, ibid., vol. 5 
(1920). See also Dodd, Functions of measurements, Skandinavisk 
Aktuarietidskrift, 1922. 

1. The deduction of the law 

Suppose that we are concerned with measuring groups of m quantities. 
We shall, for simplicity, assume that all groups are equally trustworthy, 
although the extension to the case of differently weighted groups is not difficult. 
We shall make certain assumptions about the distribution of errors, meaning, 
thereby, accidental errors, for we assume that constant errors have been 
removed. 

Assumption 1. The a priori probability that a group of quantities to be 
measured should take values in the infinitesimal region 

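$$X \pm \tfrac{1}{2}dX, \quad Y \pm \tfrac{1}{2}dY, \quad Z \pm \tfrac{1}{2}dZ, \quad \ldots,$$

where the points $(X, Y, Z, \ldots)$ lie in a certain continuous m-dimensional manifold 
S, will differ by an infinitesimal of higher order from the expression 

$$f(X, Y, Z, \ldots)\, dX\, dY\, dZ \cdots,$$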

where the function f is continuous with continuous first derivatives in S . 

Assumption 2. The probability that a group of quantities whose true values 
are $X, Y, Z, \ldots$ in S should be observed to have values, after the removal of 
constant errors, which lie in the infinitesimal region 

$$x \pm \tfrac{1}{2}dx, \quad y \pm \tfrac{1}{2}dy, \quad z \pm \tfrac{1}{2}dz, \quad \ldots,$$

where $(x, y, z, \ldots)$ is a point of S, will differ by an infinitesimal of higher 
order from 

$$\Phi(X, Y, Z, \ldots, x, y, z, \ldots)\, dx\, dy\, dz \cdots,$$

where the function $\Phi$ is continuous with continuous first and second partial 
derivatives, and has a value independent of the choice of origin. 

The last part of the assumption is plausible in practice, because if we are, 
for instance, measuring a length on a scale, the accidental errors will arise 
from various physical causes independent of the position of the zero. Moreover, 
it has a momentous consequence, for 

$$\Phi(X, Y, Z, \ldots, x, y, z, \ldots) = \Phi(0, 0, 0, \ldots, x - X, y - Y, z - Z, \ldots).$$

Writing in the explicit values of the errors we have 

$$\xi = x - X, \quad \eta = y - Y, \quad \zeta = z - Z, \quad \ldots,$$

that is to say, the probability for a system of errors is a function of those 
errors, and not of the true values and observed values considered as independent 
variables, a point which has been a stumbling block to some writers. 
Assumption 3. The mean value for the error on an individual variable is 0. 

This again is plausible, for a contrary assumption would show a tendency to 
favor positive or negative errors, and such a tendency we should naturally 
class with the constant errors, not with the accidental ones. As a further 
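matter of notation let us write the averages 

(1)    $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n), \quad \bar{y} = \frac{1}{n}(y_1 + y_2 + \cdots + y_n), \quad \bar{z} = \frac{1}{n}(z_1 + z_2 + \cdots + z_n), \quad \ldots$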

Assumption 4. If the infinitesimal increments $dx, dy, dz, \ldots$ be sufficiently 
small, the probability that the true values lie in the region 

$$\bar{x} \pm \tfrac{1}{2}dx, \quad \bar{y} \pm \tfrac{1}{2}dy, \quad \bar{z} \pm \tfrac{1}{2}dz, \quad \ldots$$

is greater than that they lie in any other region of like structure about any other 
point. 

We have now made a sufficient number of assumptions to enable us to 
deduce the analytic form for our functions. We do this, following the original 
method of Gauss, by calculating the probability that a given set of observations 
should have resulted from observing a group of quantities of assumed 
true value. The probability that the measurements $x_1, y_1, z_1, \ldots$; $x_2, y_2, z_2, \ldots$; 
$\ldots$; $x_n, y_n, z_n, \ldots$ were made on quantities whose true values are $X, Y, Z, \ldots$ 
is by Bayes' theorem 

$$\frac{f(X, Y, Z, \ldots)\, \Phi_1 \Phi_2 \cdots \Phi_n\, dX\, dY\, dZ \cdots}{\int \cdots \int f(X, Y, Z, \ldots)\, \Phi_1 \Phi_2 \cdots \Phi_n\, dX\, dY\, dZ \cdots}, \qquad \Phi_i = \Phi(\xi_i, \eta_i, \zeta_i, \ldots).$$

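The integration in the denominator is supposed to be extended throughout 
the whole region S. This expression will be a maximum with the logarithm 
of its numerator. Equating to 0 the partial derivatives with respect to $X, Y, Z, \ldots$, 
we get 

$$-\frac{\partial \log f}{\partial X} + \frac{\partial \log \Phi_1}{\partial \xi_1} + \frac{\partial \log \Phi_2}{\partial \xi_2} + \cdots + \frac{\partial \log \Phi_n}{\partial \xi_n} = 0,$$

$$-\frac{\partial \log f}{\partial Y} + \frac{\partial \log \Phi_1}{\partial \eta_1} + \frac{\partial \log \Phi_2}{\partial \eta_2} + \cdots + \frac{\partial \log \Phi_n}{\partial \eta_n} = 0,$$

$$\cdots$$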

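One set of solutions will arise in case all of the observations have been 
correct, i.e., 

$$-\frac{\partial \log f}{\partial X} + n\, \frac{\partial \log \Phi(\xi, \eta, \ldots)}{\partial \xi} = 0, \qquad \xi = \eta = \cdots = 0,$$

$$-\frac{\partial \log f}{\partial Y} + n\, \frac{\partial \log \Phi(\xi, \eta, \ldots)}{\partial \eta} = 0, \qquad \xi = \eta = \cdots = 0.$$

Now, by definition, f is independent of n, hence 

$$\frac{\partial \log f}{\partial X} = \frac{\partial \log f}{\partial Y} = \cdots = 0, \qquad f = \text{const.}$$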



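Let us underline the fact that we are considering probabilities for observations 
which do not go outside of the region S. We could not have f a constant 
throughout all space without a contradiction, and it is also evident that f 
must be rigorously 0 throughout most of space. The partial differential 
equations now take the simpler form 

$$\frac{\partial \log \Phi_1}{\partial \xi_1} + \frac{\partial \log \Phi_2}{\partial \xi_2} + \cdots + \frac{\partial \log \Phi_n}{\partial \xi_n} = 0,$$

$$\frac{\partial \log \Phi_1}{\partial \eta_1} + \frac{\partial \log \Phi_2}{\partial \eta_2} + \cdots + \frac{\partial \log \Phi_n}{\partial \eta_n} = 0,$$

$$\cdots$$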



These equations hold whenever 

$$X = \bar{x}, \quad Y = \bar{y}, \quad Z = \bar{z}, \quad \ldots,$$

$$\xi_1 + \xi_2 + \cdots + \xi_n = 0,$$

$$\eta_1 + \eta_2 + \cdots + \eta_n = 0,$$
By Assumption 2 we are free to treat our assumed groups as if they were 
absolutely independent quantities, provided we do not go outside of S. Let 
us, then, assume that the observed groups $x_1, x_2, \ldots, x_n$; $y_1, y_2, \ldots, y_n$; 
$z_1, z_2, \ldots, z_n$; $\ldots$ take such infinitesimal increments that the averages 
$\bar{x}, \bar{y}, \bar{z}, \ldots$ in (1) are not altered: 
We have also 

$$\delta\eta_1 + \delta\eta_2 + \cdots + \delta\eta_n = 0,$$



Each of the first set of equations in the variables $\delta\xi_1, \delta\xi_2, \ldots, \delta\xi_n$ must 
hold whenever the last equation in these variables holds, hence, integrating 
once, and dropping subscripts, 

Giving $\xi$ the successive values $\xi_1, \xi_2, \ldots, \xi_n$, and doing the same for the other 
variables, and then summing, we have 

$\omega = 0,$ 
$\partial \log \Phi$ 
Here the expression $\psi_2$ is a homogeneous quadratic form in the variables. 

Our assumptions are sufficient to enable us to make a very definite statement 
about the function $\psi_2$, namely, that its discriminant is not zero. For if the 
discriminant were zero, the partial derivatives would be linearly dependent, 
and vanish for an infinite number of sets of values for the variables, and this 
is directly in conflict with our fourth assumption that the only maximum 
arose from taking all of these variables equal to zero. Furthermore, since 
this is a maximum, we know that the form is definite, i.e., 

The homogeneous quadratic form $\psi_2$ is positive and definite, with a non-vanishing 
discriminant. 
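
The link between a positive definite form and Assumption 4 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the coefficients `a`, the observed groups, and the function names are hypothetical and do not come from the paper. It takes each $\Phi_k$ proportional to $e^{-\psi_2}$ and checks that the product $\Phi_1 \Phi_2 \cdots \Phi_n$ is greatest when the assumed true values are the averages of the observations.

```python
def psi2(a, errors):
    """Homogeneous quadratic form sum_ij a[i][j] * e_i * e_j."""
    m = len(errors)
    return sum(a[i][j] * errors[i] * errors[j] for i in range(m) for j in range(m))


def log_likelihood(a, groups, true_values):
    """Logarithm of Phi_1 * ... * Phi_n, up to an additive constant,
    when each Phi_k is proportional to exp(-psi2(errors of group k))."""
    total = 0.0
    for group in groups:
        errors = [obs - true for obs, true in zip(group, true_values)]
        total -= psi2(a, errors)
    return total


def averages(groups):
    """Average of each measured quantity over the n observed groups."""
    n = len(groups)
    return [sum(column) / n for column in zip(*groups)]


# Hypothetical data: n = 4 observed groups of m = 2 quantities.
a = [[1.0, 0.3], [0.3, 0.5]]  # symmetric and positive definite (assumed)
groups = [(1.0, 2.0), (1.2, 1.7), (0.9, 2.2), (1.1, 2.1)]
mean = averages(groups)
best = log_likelihood(a, groups, mean)

for shift in [(0.1, 0.0), (0.0, -0.1), (0.05, 0.05), (-0.2, 0.3)]:
    moved = [mu + s for mu, s in zip(mean, shift)]
    drop = best - log_likelihood(a, groups, moved)
    # The drop equals n * psi2(shift), which is positive for a definite form.
    print(shift, drop, len(groups) * psi2(a, shift))
```

Because the residuals from the averages sum to zero, moving the assumed true values by any shift lowers the logarithm by exactly n times the value of the form at that shift, so a definite form gives a single maximum at the averages.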

2. Determination of the constants* 

It should be emphasized that everything which we have done so far is under 
the assumption that we are dealing with observations in the region S . We 
have found the probability that an observation in the region S should lie in 
a certain infinitesimal sub-region. In practice this is of no interest whatever 
until we have some idea of what the region S may be. It certainly could not 
be the whole of space, as the assumption that f is everywhere constant leads 
to a contradiction. On further consideration we notice two things. First of 
all, it seems quite plausible that f might be constant throughout a certain 
region, and equal to 0 almost everywhere else. Second, the expression (6) 
is excessively small, except in a very strictly confined space, and this rapid 
diminution of (6) would produce a result close to that of the vanishing of f. 
In other words, the error in calculating the constants will be very small if we 
assume that the formula (6) is universally valid. On the strength of this 
we make 

Assumption 5. For the purpose of calculating constants, formula (6) may be 
assumed true throughout all space. 
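
To give a feeling for how little is lost under Assumption 5, the following sketch treats a single variable with density $\sqrt{a/\pi}\, e^{-a x^2}$ and computes the probability falling outside a bounded interval. The function name and the numerical values are hypothetical choices made for the illustration; the choice a = 1/2 corresponds to a unit mean square error.

```python
import math


def mass_outside(a, half_width):
    """Probability that |x| > half_width when the density is
    sqrt(a / pi) * exp(-a * x * x), obtained from the complementary
    error function."""
    return math.erfc(half_width * math.sqrt(a))


a = 0.5  # hypothetical coefficient: mean square error 1 / (2a) = 1
for half_width in (1, 2, 4, 6):
    print(half_width, mass_outside(a, half_width))
```

The neglected probability falls off so quickly with the size of the interval that extending the integrals to all of space changes the computed constants only very slightly.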

We note, secondly, that the only method for calculating our constants is 
to assume that certain observed values may be identified with their mean 
values as calculated by formula. The right sides of the last two equations 
(4) are proportional to the mean values of the averages of certain observed 
quantities, and we know by Tchebycheff's theorem† that it is highly likely 
that the value of an average shall be close to its mean value. This leads to 

Assumption 6. When the number of groups is large, the mean values of 
$\xi^2, \eta^2, \xi\eta, \ldots$ may be equated to the observed values 

$$\frac{\sum_i \xi_i^2}{n-1}, \quad \frac{\sum_i \eta_i^2}{n-1}, \quad \frac{\sum_i \xi_i \eta_i}{n-1}, \quad \ldots$$

These quantities give, of course, the probable errors of individual measurements 
and their correlation coefficients two by two. It is interesting that 
these should be the only independent constants. 

* The mathematical manipulation that follows depends on obvious applications of the 
theory of determinants. The methods and final formula are very close to Greiner, 
Zeitschrift für Mathematik und Physik, vol. 57, pp. 226 ff., and Pearson, 
Philosophical Transactions of the Royal Society, vol. 187, pp. 299 ff. Pearson 
assigns the credit to Edgeworth, Philosophical Magazine, ser. 5, vol. 34, p. 201. 
I must confess to finding Edgeworth so obscure that I do not know whether his result is like 
mine or not. Moreover, none of these writers seem to me to set forth the underlying 
assumptions with desirable clearness. 

† Tchebycheff, Oeuvres, Petrograd, 1899, vol. 1, p. 687. 

In order to clarify the manipulation, we shall at this point take the perilous 
step of changing our notation. Logically, the notation we are now going to 
adopt might well have been used from the start, but the resulting summation 
formulas, by their very compactness, are rather obscure, and it is easier to 
see what is really going on, by using the more diffuse symbolism with many 
continuation signs which we have employed so far. For the actual errors 
committed, we shall write 

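(7)    $\xi = x_1, \quad \eta = x_2, \quad \zeta = x_3, \quad \ldots,$

there being m in all. The n sets of m residuals shall be written 

$\delta_{11}, \delta_{12}, \ldots, \delta_{1m}$; $\delta_{21}, \delta_{22}, \ldots, \delta_{2m}$; $\ldots$; $\delta_{n1}, \delta_{n2}, \ldots, \delta_{nm}$. 

Our fundamental formula (6) may now be written 

(8)    $\Phi = R\, e^{-\sum_{i,j} a_{ij} x_i x_j}, \qquad a_{ij} = a_{ji}.$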

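The assumptions 5 and 6 may be expressed by the equation 

(9)    $p_{ij} = \frac{\sum_k \delta_{ki}\delta_{kj}}{n-1} = R \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i x_j\, e^{-\sum_{r,s} a_{rs} x_r x_s}\, dx_1\, dx_2 \cdots dx_m.$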

Since the discriminant of our quadratic form is not zero, we may find a 
linear transformation 
Since the discriminant is an invariant of weight 2, 

$$b_1 b_2 \cdots b_m = |c_{ij}|^2 \cdot |a_{ij}|.$$

The inverse of the substitution, contragredient to (10), is 

$$w_k' = \sum_i c_{ik} w_i,$$

$$w_k'^2 = \sum_{i,j} c_{ik} c_{jk} w_i w_j.$$

In the projective space of m − 1 dimensions where a point has the homogeneous 
coordinates $x_1, x_2, \ldots, x_m$ the hyperquadric 

$$\sum_{i,j} a_{ij} x_i x_j = 0$$

has the tangential equation 

$$\sum_{i,j} A_{ij} w_i w_j = 0, \qquad A_{ij} = \frac{\partial |a_{ij}|}{\partial a_{ij}}.$$
In terms of the new variables we have 

$$\sum_{i,j} A_{ij} w_i w_j = |a_{ij}| \sum_k \frac{w_k'^2}{b_k} = |a_{ij}| \sum_{i,j} \Big( \sum_k \frac{c_{ik} c_{jk}}{b_k} \Big) w_i w_j.$$
We may express (9) in terms of the new variables. The only point to remember 
is that the jacobian of the transformation is $|c_{ij}|$; 

$$p_{ij} = R\, |c_{ij}| \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \sum_{k,l} c_{ik} c_{jl}\, x_k' x_l'\, e^{-\sum_r b_r x_r'^2}\, dx_1'\, dx_2' \cdots dx_m'.$$


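This simplifies greatly because the terms with $k \neq l$ are odd in $x_k'$ and 
contribute nothing to the integral. 
We have the well known integrals 

$$\int_{-\infty}^{\infty} e^{-b x^2}\, dx = \sqrt{\frac{\pi}{b}}, \qquad \int_{-\infty}^{\infty} x^2 e^{-b x^2}\, dx = \frac{1}{2b} \sqrt{\frac{\pi}{b}}.$$

Hence 

$$p_{ij} = R\, |c_{ij}|\, \frac{\pi^{m/2}}{\sqrt{b_1 b_2 \cdots b_m}} \sum_k \frac{c_{ik} c_{jk}}{2 b_k}.$$
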
Furthermore, since the probability of some group of errors is 1, 

$$1 = R \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\sum_{i,j} a_{ij} x_i x_j}\, dx_1 \cdots dx_m,$$

Dividing out R, we have 

$$p_{ij} = \frac{A_{ij}}{2\, |a_{ij}|}.$$
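
The relation between the mean products $p_{ij}$ and the cofactors $A_{ij}$ can be checked numerically in the case m = 2. The sketch below is a rough check under stated assumptions: the coefficients `a` are hypothetical, the function names are invented for the example, and the integrals over all space are approximated by a plain Riemann sum over a large square.

```python
import math


def moments_by_quadrature(a, half_width=10.0, step=0.05):
    """Second moments of the density proportional to
    exp(-(a11*x1^2 + 2*a12*x1*x2 + a22*x2^2)), by a plain Riemann sum.
    The constant R and the cell area cancel in the ratios."""
    steps = int(round(2 * half_width / step))
    z = s11 = s12 = s22 = 0.0
    for u in range(steps + 1):
        x1 = -half_width + u * step
        for v in range(steps + 1):
            x2 = -half_width + v * step
            w = math.exp(-(a[0][0] * x1 * x1
                           + 2 * a[0][1] * x1 * x2
                           + a[1][1] * x2 * x2))
            z += w
            s11 += x1 * x1 * w
            s12 += x1 * x2 * w
            s22 += x2 * x2 * w
    return [[s11 / z, s12 / z], [s12 / z, s22 / z]]


def moments_by_formula(a):
    """p_ij = A_ij / (2 |a_ij|), with A_ij the cofactor of a_ij (m = 2)."""
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    cof = [[a[1][1], -a[1][0]], [-a[0][1], a[0][0]]]
    return [[cof[i][j] / (2 * d) for j in range(2)] for i in range(2)]


a = [[1.0, 0.3], [0.3, 0.5]]  # hypothetical coefficients, positive definite
print(moments_by_quadrature(a))
print(moments_by_formula(a))
```

The two printed matrices should agree to many decimal places, since the integrand is negligible outside the square used for the sum.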



In these equations the quantities $p_{ij}$ are known; we wish to find the quantities 
$a_{ij}$. We first introduce one more symbol: 

$$P_{ij} = \frac{\partial |p_{ij}|}{\partial p_{ij}}.$$


Since the process of interchanging each element of a non-vanishing determinant 
with its cofactor is an involutory one, except for multiplication by a 
power of the determinant, we must have 

$$a_{ij} = M P_{ij}.$$

We calculate M by a little jugglery: 

$$|a_{ij}| = M^m |P_{ij}| = M^m |p_{ij}|^{m-1},$$

$$|p_{ij}| = \frac{|A_{ij}|}{2^m |a_{ij}|^m} = \frac{|a_{ij}|^{m-1}}{2^m |a_{ij}|^m} = \frac{1}{2^m |a_{ij}|},$$

$$M = \frac{1}{2\, |p_{ij}|}.$$
Let us exhibit our results in the form of a table: 

Assumed errors: $x_1, x_2, \ldots, x_m$. 
Gaussian law of error for m variables: $\Phi = R\, e^{-\sum_{i,j} a_{ij} x_i x_j}$. 
Observed residuals: $\delta_{11}, \delta_{12}, \ldots, \delta_{1m}$; $\delta_{21}, \delta_{22}, \ldots, \delta_{2m}$; $\ldots$; $\delta_{n1}, \delta_{n2}, \ldots, \delta_{nm}$. 

$$p_{ij} = \frac{\sum_k \delta_{ki}\delta_{kj}}{n-1}, \qquad P_{ij} = \frac{\partial |p_{ij}|}{\partial p_{ij}},$$

$$a_{ij} = \frac{P_{ij}}{2\, |p_{ij}|}, \qquad R = \frac{1}{\sqrt{(2\pi)^m |p_{ij}|}}.$$

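
The table translates directly into a short computation. The sketch below is one possible rendering, not a definitive implementation: the function names and the sample matrix `p` are hypothetical, and the determinant is found by Laplace expansion, which is adequate only for a small number of variables.

```python
import math


def det(mat):
    """Determinant by Laplace expansion along the first row (small m only)."""
    m = len(mat)
    if m == 1:
        return mat[0][0]
    total = 0.0
    for j in range(m):
        minor = [list(row[:j]) + list(row[j + 1:]) for row in mat[1:]]
        total += (-1) ** j * mat[0][j] * det(minor)
    return total


def cofactor(mat, i, j):
    """Cofactor of element (i, j): the derivative of the determinant with
    respect to that element, all elements being treated as independent."""
    m = len(mat)
    if m == 1:
        return 1.0
    minor = [list(row[:j]) + list(row[j + 1:])
             for k, row in enumerate(mat) if k != i]
    return (-1) ** (i + j) * det(minor)


def gaussian_constants(p):
    """Coefficients a_ij = P_ij / (2 |p|) and R = 1 / sqrt((2 pi)^m |p|)."""
    m = len(p)
    d = det(p)
    a = [[cofactor(p, i, j) / (2 * d) for j in range(m)] for i in range(m)]
    r = 1.0 / math.sqrt((2 * math.pi) ** m * d)
    return a, r


# Hypothetical mean squares and products for m = 2 variables.
p = [[2.0, 0.6], [0.6, 1.0]]
a, R = gaussian_constants(p)
print(a)
print(R)
```

A convenient check on any result is that the matrix of the $a_{ij}$, multiplied by twice the matrix of the $p_{ij}$, gives the identity.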

Harvard University, 
Cambridge, Mass. 